Combined simulated data adaptation and piecewise linear transformation for robust speech recognition
Abstract
This paper proposes a combination of simulated data adaptation and piecewise linear transformation (PLT) for robust continuous speech recognition. The original PLT selects an appropriate acoustic model using tree-structured HMMs, and the selected model is then adapted to the input speech in an unsupervised scheme. This adaptation can improve the acoustic model if the input speech is long enough and is correctly transcribed during adaptation; an incorrect transcription, however, can drastically degrade the model. Our proposed method increases the amount of adaptation data by adding noise portions from the input speech to a set of pre-recorded clean speech whose correct transcriptions are known. We investigate various configurations of the proposed method. Evaluations are performed on continuous speech corrupted by additive noise. The experimental results show that the proposed system reaches higher recognition rates than MLLR and PLT.
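The data-simulation step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the boolean `speech_mask` (assumed to come from some voice activity detector), and the choice to tile the noise and add it at its observed level are all assumptions made for illustration.

```python
import numpy as np

def extract_noise(signal, speech_mask):
    """Return the samples flagged as non-speech.

    speech_mask is a hypothetical boolean array (True = speech),
    assumed to be produced by a voice activity detector.
    """
    return signal[~speech_mask]

def add_noise(clean, noise):
    """Tile the noise to the length of the clean utterance and add it."""
    reps = int(np.ceil(len(clean) / len(noise)))
    tiled = np.tile(noise, reps)[:len(clean)]
    return clean + tiled

def simulate_adaptation_set(input_signal, speech_mask, clean_set):
    """Build simulated noisy adaptation data.

    clean_set is a list of (waveform, transcription) pairs of
    pre-recorded clean speech; transcriptions are passed through
    unchanged, so the simulated data keeps its known labels.
    """
    noise = extract_noise(input_signal, speech_mask)
    if len(noise) == 0:
        raise ValueError("no non-speech samples found in the input")
    return [(add_noise(wave, noise), text) for wave, text in clean_set]
```

Because the transcriptions of the clean recordings are known in advance, the simulated utterances could be used for supervised adaptation, which is the property the abstract relies on.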
Related articles
Maximum Likelihood Non-linear Transformation for Environment Adaptation in Speech Recognition Systems
In this paper, we describe an adaptation method for speech recognition systems that is based on a piecewise-linear approximation to a non-linear transformation of the feature space. The method extends a previously proposed non-linear transformation (NLT) technique by making the transformation function more sophisticated (piecewise-linear instead of piecewise-constant), and by computing the trans...
Improving the performance of MFCC for Persian robust speech recognition
The Mel frequency cepstral coefficients (MFCCs) are the most widely used features in speech recognition, but they are very sensitive to noise. In this paper, to achieve satisfactory performance in Automatic Speech Recognition (ASR) applications, we introduce a new noise-robust set of MFCC vectors estimated through the following steps. First, spectral mean normalization is a pre-processing which applies to t...
Tree-structured noise-adapted HMM modeling for piecewise linear-transformation-based adaptation
This paper proposes the application of tree-structured clustering to various noise samples or noisy speech in the framework of piecewise-linear transformation (PLT)-based noise adaptation. According to the clustering results, a noisy speech HMM is made for each node of the tree structure. Based on the likelihood maximization criterion, the HMM that best matches the input speech is selected by t...
A simulated-data adaptation technique for robust speech recognition
This paper proposes an efficient acoustic model adaptation method based on the use of simulated data in maximum likelihood linear regression (MLLR) adaptation for robust speech recognition. Online MLLR adaptation is an unsupervised process that requires input speech with automatically transcribed phone labels. Instead of using only the input signal in adaptation, our proposed simulated data...
Enhancing children's speech recognition under mismatched condition by explicit acoustic normalization
Most commonly used model adaptation techniques employ linear/affine transformation on models/features to address the gross acoustic mismatch between the adults’ and the children’s speech data. Since all sources of acoustic mismatch may not be appropriately modeled by just linear transformation, in this work, the efficacy of our recently proposed explicit acoustic (pitch and speaking rate) norma...